Research on non-time-series data filling methods based on feature evaluation
Authors
Abstract
With the rapid development of the information age, a large amount of data is used in popular research areas such as data mining. Missing data has a very serious impact on both the process and the result of data mining, so it is important to find out how to fill missing values accurately and efficiently. In this paper, we propose a method to optimally fill missing values based on backpropagation evaluation functions for non-time-series data. Based on the target value error and the data's own error after filling, four classical filling methods, namely mean filling, interpolation filling, model prediction filling, and K-nearest neighbor filling, are considered for selection. Finally, single-model padding and multi-model weighted padding schemes are compared, and the results show that single-model padding with the highest-fitness method selected works best across different degrees of missingness and different datasets.
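The selection idea in the abstract can be illustrated with a small sketch. This is not the paper's evaluation function: the abstract does not specify it, so the "fitness" here is assumed to be the RMSE on observed values that are deliberately hidden and then re-filled. The function names (`fill_mean`, `fill_interp`, `select_filler`) and the `holdout_frac` parameter are hypothetical, and only two of the four candidate methods are shown.

```python
import numpy as np

def fill_mean(col):
    """Replace NaNs with the mean of the observed values."""
    out = col.copy()
    out[np.isnan(out)] = np.nanmean(col)
    return out

def fill_interp(col):
    """Replace NaNs by linear interpolation over the row index."""
    out = col.copy()
    idx = np.arange(len(col))
    known = ~np.isnan(col)
    out[~known] = np.interp(idx[~known], idx[known], col[known])
    return out

def select_filler(col, fillers, holdout_frac=0.2, seed=0):
    """Hide some observed values, re-fill them with each candidate,
    and return the candidate with the lowest RMSE (assumed fitness)."""
    rng = np.random.default_rng(seed)
    known = np.flatnonzero(~np.isnan(col))
    n_hidden = max(1, int(holdout_frac * len(known)))
    hidden = rng.choice(known, size=n_hidden, replace=False)
    masked = col.copy()
    masked[hidden] = np.nan
    scores = {}
    for name, fill in fillers.items():
        filled = fill(masked)
        err = filled[hidden] - col[hidden]
        scores[name] = float(np.sqrt(np.mean(err ** 2)))
    best = min(scores, key=scores.get)
    return best, scores

# Toy column: a linear ramp with three missing entries.
col = np.arange(20, dtype=float)
col[[3, 7, 12]] = np.nan
best, scores = select_filler(col, {"mean": fill_mean, "interp": fill_interp})
print(best, scores)
```

Model-prediction and K-nearest-neighbor fillers could be added to the `fillers` dictionary in the same way, provided each takes a column with NaNs and returns a filled copy.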
Related articles
Research On Similarity of Stochastic Non-stationary Time Series Data Based on Wavelet-Fractal
Traditional dimension-reduction methods for similarity queries smooth the data series to some degree, but this causes important features of the time series, such as non-linearity and fractal behaviour, to disappear. The matching method based on the wavelet transform measures similarity using a distance standard at some resolution level. But in the case of an unknown fractal dim...
Time series prediction based on data compression methods
We propose efficient ("fast" and low memory consuming) algorithms for universal-coding-based prediction methods for real-valued time series. Previously, for such methods it was only proved that the prediction error is asymptotically minimal, and implementation complexity issues have not been considered at all. The provided experimental results demonstrate high precision of the proposed methods. ...
Research on Cassandra Data Compaction Strategies for Time-Series Data
Storage and analysis of time-series data is a subject of intense interest in current international database research. Time-series data, a sequence of data points collected at fixed time intervals, is an important basis for business analysis and future prediction. As an excellent NoSQL database, Cassandra is often used to store time-series data because of it...
Querying Time Series Data Based on Similarity
We study similarity queries for time series data where similarity is defined, in a fairly general way, in terms of a distance function and a set of affine transformations on the Fourier series representation of a sequence. We identify a safe set of transformations supporting a wide variety of comparisons and show that this set is rich enough to formulate operations such as moving average and t...
Journal
Journal title: Journal of physics
Year: 2023
ISSN: ['0022-3700', '1747-3721', '0368-3508', '1747-3713']
DOI: https://doi.org/10.1088/1742-6596/2425/1/012060